Overview
Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 270000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 33.0 MiB |
| Average record size in memory | 128.0 B |
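The two memory figures above can be cross-checked against each other: the average record size should be close to the total size divided by the number of observations. The sketch below assumes the report uses binary units (1 MiB = 1024 × 1024 bytes) and that both displayed values are rounded; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics table.
n_observations = 270_000
total_mib = 33.0          # "Total size in memory" (rounded in the report)
avg_record_bytes = 128.0  # "Average record size in memory"

# Assumption: one MiB is 1024 * 1024 bytes.
BYTES_PER_MIB = 1024 * 1024

# Average record size implied by the total size.
implied_avg = total_mib * BYTES_PER_MIB / n_observations

# Total size implied by the average record size.
implied_total_mib = avg_record_bytes * n_observations / BYTES_PER_MIB

print(f"implied average record size: {implied_avg:.1f} B")
print(f"implied total size: {implied_total_mib:.2f} MiB")
```

Both implied values agree with the table to within rounding, which is consistent with 16 variables stored at 8 bytes each.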
Variable types
| Numeric | 10 |
|---|---|
| Categorical | 5 |
| Boolean | 1 |
Alerts
| Alert | Type |
|---|---|
| age is highly overall correlated with cred_hist_length and 1 other field | High correlation |
| cred_hist_length is highly overall correlated with age and 1 other field | High correlation |
| emp_exp is highly overall correlated with age and 1 other field | High correlation |
| loan_amount is highly overall correlated with percent_income | High correlation |
| loan_id is highly overall correlated with person_id | High correlation |
| loan_status is highly overall correlated with previous_defaults | High correlation |
| percent_income is highly overall correlated with loan_amount | High correlation |
| person_id is highly overall correlated with loan_id | High correlation |
| previous_defaults is highly overall correlated with loan_status | High correlation |
| income is highly skewed (γ1 = 34.13663485) | Skewed |
| loan_id is uniformly distributed | Uniform |
| person_id is uniformly distributed | Uniform |
| loan_id has unique values | Unique |
| person_id has unique values | Unique |
| emp_exp has 57396 (21.3%) zeros | Zeros |
Reproduction
| Analysis started | 2024-12-21 23:30:37.565767 |
|---|---|
| Analysis finished | 2024-12-21 23:31:04.855742 |
| Duration | 27.29 seconds |
| Software version | ydata-profiling v4.12.1 |
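The reported duration follows directly from the two timestamps in the table. A minimal sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-12-21 23:30:37.565767", FMT)
finished = datetime.strptime("2024-12-21 23:31:04.855742", FMT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```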
Variables
loan_id
Real number (ℝ)
High correlation  Uniform  Unique 
| Distinct | 270000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 135000.5 |
| Minimum | 1 |
|---|---|
| Maximum | 270000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 13500.95 |
| Q1 | 67500.75 |
| median | 135000.5 |
| Q3 | 202500.25 |
| 95-th percentile | 256500.05 |
| Maximum | 270000 |
| Range | 269999 |
| Interquartile range (IQR) | 134999.5 |
Descriptive statistics
| Standard deviation | 77942.431 |
|---|---|
| Coefficient of variation (CV) | 0.5773492 |
| Kurtosis | -1.2 |
| Mean | 135000.5 |
| Median Absolute Deviation (MAD) | 67500 |
| Skewness | 0 |
| Sum | 3.6450135 × 10¹⁰ |
| Variance | 6.0750225 × 10⁹ |
| Monotonicity | Strictly increasing |
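Because loan_id takes each integer from 1 to 270000 exactly once, its statistics have closed forms. The sketch below reproduces several of the figures above. It assumes the report uses the sample variance (divisor n - 1) and linear interpolation for percentiles; the report does not state either choice, but both are consistent with the numbers shown.

```python
import math

n = 270_000  # number of observations; loan_id runs from 1 to n

mean = (n + 1) / 2
total = n * (n + 1) // 2

# Sample variance (divisor n - 1) of the integers 1..n simplifies to n(n+1)/12.
variance = n * (n + 1) / 12
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation


def percentile(p):
    """Linear-interpolation percentile of the sorted values 1..n."""
    return 1 + p * (n - 1)


print("mean:", mean)
print("sum:", total)
print("variance:", variance)
print("std:", round(std, 3), "cv:", round(cv, 7))
print("5%, Q1, Q3:", percentile(0.05), percentile(0.25), percentile(0.75))
```

The same reasoning applies to person_id, which has identical statistics.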
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 179990 | 1 | < 0.1% |
| 179992 | 1 | < 0.1% |
| 179993 | 1 | < 0.1% |
| 179994 | 1 | < 0.1% |
| 179995 | 1 | < 0.1% |
| 179996 | 1 | < 0.1% |
| 179997 | 1 | < 0.1% |
| 179998 | 1 | < 0.1% |
| 179999 | 1 | < 0.1% |
| Other values (269990) | 269990 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 |
| Value | Count | Frequency (%) |
| 270000 | 1 | |
| 269999 | 1 | |
| 269998 | 1 | |
| 269997 | 1 | |
| 269996 | 1 | |
| 269995 | 1 | |
| 269994 | 1 | |
| 269993 | 1 | |
| 269992 | 1 | |
| 269991 | 1 |
person_id
Real number (ℝ)
High correlation  Uniform  Unique 
| Distinct | 270000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 135000.5 |
| Minimum | 1 |
|---|---|
| Maximum | 270000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 13500.95 |
| Q1 | 67500.75 |
| median | 135000.5 |
| Q3 | 202500.25 |
| 95-th percentile | 256500.05 |
| Maximum | 270000 |
| Range | 269999 |
| Interquartile range (IQR) | 134999.5 |
Descriptive statistics
| Standard deviation | 77942.431 |
|---|---|
| Coefficient of variation (CV) | 0.5773492 |
| Kurtosis | -1.2 |
| Mean | 135000.5 |
| Median Absolute Deviation (MAD) | 67500 |
| Skewness | 0 |
| Sum | 3.6450135 × 10¹⁰ |
| Variance | 6.0750225 × 10⁹ |
| Monotonicity | Strictly increasing |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 179990 | 1 | < 0.1% |
| 179992 | 1 | < 0.1% |
| 179993 | 1 | < 0.1% |
| 179994 | 1 | < 0.1% |
| 179995 | 1 | < 0.1% |
| 179996 | 1 | < 0.1% |
| 179997 | 1 | < 0.1% |
| 179998 | 1 | < 0.1% |
| 179999 | 1 | < 0.1% |
| Other values (269990) | 269990 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 |
| Value | Count | Frequency (%) |
| 270000 | 1 | |
| 269999 | 1 | |
| 269998 | 1 | |
| 269997 | 1 | |
| 269996 | 1 | |
| 269995 | 1 | |
| 269994 | 1 | |
| 269993 | 1 | |
| 269992 | 1 | |
| 269991 | 1 |
age
Real number (ℝ)
High correlation 
| Distinct | 60 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.764178 |
| Minimum | 20 |
|---|---|
| Maximum | 144 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 22 |
| Q1 | 24 |
| median | 26 |
| Q3 | 30 |
| 95-th percentile | 39 |
| Maximum | 144 |
| Range | 124 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 6.0450522 |
|---|---|
| Coefficient of variation (CV) | 0.21772848 |
| Kurtosis | 18.647611 |
| Mean | 27.764178 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 2.5480832 |
| Sum | 7496328 |
| Variance | 36.542657 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23 | 31524 | 11.7% |
| 24 | 30828 | 11.4% |
| 25 | 27042 | 10.0% |
| 22 | 25416 | 9.4% |
| 26 | 21954 | 8.1% |
| 27 | 18570 | 6.9% |
| 28 | 16368 | 6.1% |
| 29 | 14730 | 5.5% |
| 30 | 12126 | 4.5% |
| 31 | 9870 | 3.7% |
| Other values (50) | 61572 | 22.8% |
| Value | Count | Frequency (%) |
| 20 | 102 | < 0.1% |
| 21 | 7734 | 2.9% |
| 22 | 25416 | |
| 23 | 31524 | |
| 24 | 30828 | |
| 25 | 27042 | |
| 26 | 21954 | |
| 27 | 18570 | |
| 28 | 16368 | |
| 29 | 14730 |
| Value | Count | Frequency (%) |
| 144 | 18 | |
| 123 | 12 | |
| 116 | 6 | < 0.1% |
| 109 | 6 | < 0.1% |
| 94 | 6 | < 0.1% |
| 84 | 6 | < 0.1% |
| 80 | 6 | < 0.1% |
| 78 | 6 | < 0.1% |
| 76 | 6 | < 0.1% |
| 73 | 18 |
gender
Categorical
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.8959556 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | female |
|---|---|
| 2nd row | female |
| 3rd row | female |
| 4th row | female |
| 5th row | male |
Common Values
| Value | Count | Frequency (%) |
| male | 149046 | 55.2% |
| female | 120954 | 44.8% |
Words
| Value | Count | Frequency (%) |
| male | 149046 | |
| female | 120954 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 390954 | |
| m | 270000 | |
| a | 270000 | |
| l | 270000 | |
| f | 120954 | 9.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1321908 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 390954 | |
| m | 270000 | |
| a | 270000 | |
| l | 270000 | |
| f | 120954 | 9.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1321908 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 390954 | |
| m | 270000 | |
| a | 270000 | |
| l | 270000 | |
| f | 120954 | 9.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1321908 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 390954 | |
| m | 270000 | |
| a | 270000 | |
| l | 270000 | |
| f | 120954 | 9.1% |
education
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.1 MiB |
| Value | Count |
|---|---|
| Bachelor | 80394 |
| Associate | 72168 |
| High School | 71832 |
| Master | 41880 |
| Doctorate | 3726 |
Length
| Max length | 11 |
|---|---|
| Median length | 9 |
| Mean length | 8.769 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Master |
|---|---|
| 2nd row | High School |
| 3rd row | High School |
| 4th row | Bachelor |
| 5th row | Master |
Common Values
| Value | Count | Frequency (%) |
| Bachelor | 80394 | 29.8% |
| Associate | 72168 | 26.7% |
| High School | 71832 | 26.6% |
| Master | 41880 | 15.5% |
| Doctorate | 3726 | 1.4% |
Words
| Value | Count | Frequency (%) |
| bachelor | 80394 | |
| associate | 72168 | |
| high | 71832 | |
| school | 71832 | |
| master | 41880 | |
| doctorate | 3726 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 303678 | |
| c | 228120 | |
| h | 224058 | |
| e | 198168 | |
| a | 198168 | |
| s | 186216 | 7.9% |
| l | 152226 | 6.4% |
| i | 144000 | 6.1% |
| r | 126000 | 5.3% |
| t | 121500 | 5.1% |
| Other values (8) | 485496 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2367630 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| o | 303678 | |
| c | 228120 | |
| h | 224058 | |
| e | 198168 | |
| a | 198168 | |
| s | 186216 | 7.9% |
| l | 152226 | 6.4% |
| i | 144000 | 6.1% |
| r | 126000 | 5.3% |
| t | 121500 | 5.1% |
| Other values (8) | 485496 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2367630 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| o | 303678 | |
| c | 228120 | |
| h | 224058 | |
| e | 198168 | |
| a | 198168 | |
| s | 186216 | 7.9% |
| l | 152226 | 6.4% |
| i | 144000 | 6.1% |
| r | 126000 | 5.3% |
| t | 121500 | 5.1% |
| Other values (8) | 485496 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2367630 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| o | 303678 | |
| c | 228120 | |
| h | 224058 | |
| e | 198168 | |
| a | 198168 | |
| s | 186216 | 7.9% |
| l | 152226 | 6.4% |
| i | 144000 | 6.1% |
| r | 126000 | 5.3% |
| t | 121500 | 5.1% |
| Other values (8) | 485496 |
income
Real number (ℝ)
Skewed 
| Distinct | 33989 |
|---|---|
| Distinct (%) | 12.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 80319.053 |
| Minimum | 8000 |
|---|---|
| Maximum | 7200766 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 8000 |
|---|---|
| 5-th percentile | 28366.7 |
| Q1 | 47204 |
| median | 67048 |
| Q3 | 95789.25 |
| 95-th percentile | 166754.7 |
| Maximum | 7200766 |
| Range | 7192766 |
| Interquartile range (IQR) | 48585.25 |
Descriptive statistics
| Standard deviation | 80421.754 |
|---|---|
| Coefficient of variation (CV) | 1.0012787 |
| Kurtosis | 2398.4626 |
| Mean | 80319.053 |
| Median Absolute Deviation (MAD) | 23124 |
| Skewness | 34.136635 |
| Sum | 2.1686144 × 10¹⁰ |
| Variance | 6.4676585 × 10⁹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8000 | 90 | < 0.1% |
| 73011 | 60 | < 0.1% |
| 36995 | 54 | < 0.1% |
| 60914 | 48 | < 0.1% |
| 37020 | 48 | < 0.1% |
| 73082 | 42 | < 0.1% |
| 60864 | 42 | < 0.1% |
| 67131 | 42 | < 0.1% |
| 72951 | 42 | < 0.1% |
| 73040 | 42 | < 0.1% |
| Other values (33979) | 269490 |
| Value | Count | Frequency (%) |
| 8000 | 90 | |
| 8037 | 6 | < 0.1% |
| 8104 | 6 | < 0.1% |
| 8186 | 6 | < 0.1% |
| 8248 | 6 | < 0.1% |
| 8267 | 6 | < 0.1% |
| 8277 | 6 | < 0.1% |
| 8302 | 6 | < 0.1% |
| 8518 | 6 | < 0.1% |
| 9364 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 7200766 | 6 | |
| 5556399 | 6 | |
| 5545545 | 6 | |
| 2448661 | 6 | |
| 2280980 | 6 | |
| 2139143 | 6 | |
| 2012954 | 6 | |
| 1741243 | 6 | |
| 1728974 | 6 | |
| 1661567 | 6 |
emp_exp
Real number (ℝ)
High correlation  Zeros 
| Distinct | 63 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.4103333 |
| Minimum | 0 |
|---|---|
| Maximum | 125 |
| Zeros | 57396 |
| Zeros (%) | 21.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 4 |
| Q3 | 8 |
| 95-th percentile | 17 |
| Maximum | 125 |
| Range | 125 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 6.0634759 |
|---|---|
| Coefficient of variation (CV) | 1.1207213 |
| Kurtosis | 19.166438 |
| Mean | 5.4103333 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 2.5948453 |
| Sum | 1460790 |
| Variance | 36.765741 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 57396 | 21.3% |
| 2 | 24804 | 9.2% |
| 1 | 24366 | 9.0% |
| 3 | 23340 | 8.6% |
| 4 | 21144 | 7.8% |
| 5 | 18000 | 6.7% |
| 6 | 16302 | 6.0% |
| 7 | 13224 | 4.9% |
| 8 | 11340 | 4.2% |
| 9 | 9450 | 3.5% |
| Other values (53) | 50634 | 18.8% |
| Value | Count | Frequency (%) |
| 0 | 57396 | |
| 1 | 24366 | |
| 2 | 24804 | |
| 3 | 23340 | |
| 4 | 21144 | 7.8% |
| 5 | 18000 | 6.7% |
| 6 | 16302 | 6.0% |
| 7 | 13224 | 4.9% |
| 8 | 11340 | 4.2% |
| 9 | 9450 | 3.5% |
| Value | Count | Frequency (%) |
| 125 | 6 | |
| 124 | 6 | |
| 121 | 6 | |
| 101 | 6 | |
| 100 | 6 | |
| 93 | 6 | |
| 85 | 6 | |
| 76 | 6 | |
| 62 | 6 | |
| 61 | 6 |
home_ownership
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.1 MiB |
| Value | Count |
|---|---|
| RENT | 140658 |
| MORTGAGE | 110934 |
| OWN | 17706 |
| OTHER | 702 |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 5.5804889 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | RENT |
|---|---|
| 2nd row | OWN |
| 3rd row | MORTGAGE |
| 4th row | RENT |
| 5th row | RENT |
Common Values
| Value | Count | Frequency (%) |
| RENT | 140658 | 52.1% |
| MORTGAGE | 110934 | 41.1% |
| OWN | 17706 | 6.6% |
| OTHER | 702 | 0.3% |
Words
| Value | Count | Frequency (%) |
| rent | 140658 | |
| mortgage | 110934 | |
| own | 17706 | 6.6% |
| other | 702 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| R | 252294 | |
| E | 252294 | |
| T | 252294 | |
| G | 221868 | |
| N | 158364 | |
| O | 129342 | |
| M | 110934 | |
| A | 110934 | |
| W | 17706 | 1.2% |
| H | 702 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1506732 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| R | 252294 | |
| E | 252294 | |
| T | 252294 | |
| G | 221868 | |
| N | 158364 | |
| O | 129342 | |
| M | 110934 | |
| A | 110934 | |
| W | 17706 | 1.2% |
| H | 702 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1506732 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| R | 252294 | |
| E | 252294 | |
| T | 252294 | |
| G | 221868 | |
| N | 158364 | |
| O | 129342 | |
| M | 110934 | |
| A | 110934 | |
| W | 17706 | 1.2% |
| H | 702 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1506732 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| R | 252294 | |
| E | 252294 | |
| T | 252294 | |
| G | 221868 | |
| N | 158364 | |
| O | 129342 | |
| M | 110934 | |
| A | 110934 | |
| W | 17706 | 1.2% |
| H | 702 | < 0.1% |
loan_amount
Real number (ℝ)
High correlation 
| Distinct | 4483 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9583.1576 |
| Minimum | 500 |
|---|---|
| Maximum | 35000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 500 |
|---|---|
| 5-th percentile | 2000 |
| Q1 | 5000 |
| median | 8000 |
| Q3 | 12237.25 |
| 95-th percentile | 24000 |
| Maximum | 35000 |
| Range | 34500 |
| Interquartile range (IQR) | 7237.25 |
Descriptive statistics
| Standard deviation | 6314.8282 |
|---|---|
| Coefficient of variation (CV) | 0.65895068 |
| Kurtosis | 1.350979 |
| Mean | 9583.1576 |
| Median Absolute Deviation (MAD) | 3800 |
| Skewness | 1.1796985 |
| Sum | 2.5874525 × 10⁹ |
| Variance | 39877055 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10000 | 21702 | 8.0% |
| 5000 | 16722 | 6.2% |
| 6000 | 14556 | 5.4% |
| 12000 | 14496 | 5.4% |
| 15000 | 12024 | 4.5% |
| 8000 | 11568 | 4.3% |
| 4000 | 8436 | 3.1% |
| 20000 | 8310 | 3.1% |
| 3000 | 8268 | 3.1% |
| 7000 | 7884 | 2.9% |
| Other values (4473) | 146034 |
| Value | Count | Frequency (%) |
| 500 | 30 | |
| 563 | 6 | < 0.1% |
| 700 | 6 | < 0.1% |
| 725 | 6 | < 0.1% |
| 750 | 6 | < 0.1% |
| 800 | 6 | < 0.1% |
| 900 | 12 | < 0.1% |
| 912 | 6 | < 0.1% |
| 922 | 6 | < 0.1% |
| 950 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 35000 | 1404 | |
| 34826 | 6 | < 0.1% |
| 34800 | 6 | < 0.1% |
| 34664 | 6 | < 0.1% |
| 34375 | 6 | < 0.1% |
| 34322 | 6 | < 0.1% |
| 34121 | 6 | < 0.1% |
| 34000 | 24 | < 0.1% |
| 33950 | 12 | < 0.1% |
| 33800 | 6 | < 0.1% |
loan_intent
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.1 MiB |
| Value | Count |
|---|---|
| EDUCATION | 54918 |
| MEDICAL | 51288 |
| VENTURE | 46914 |
| PERSONAL | 45312 |
| DEBTCONSOLIDATION | 42870 |
Length
| Max length | 17 |
|---|---|
| Median length | 15 |
| Mean length | 10.012711 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PERSONAL |
|---|---|
| 2nd row | EDUCATION |
| 3rd row | MEDICAL |
| 4th row | MEDICAL |
| 5th row | MEDICAL |
Common Values
| Value | Count | Frequency (%) |
| EDUCATION | 54918 | 20.3% |
| MEDICAL | 51288 | 19.0% |
| VENTURE | 46914 | 17.4% |
| PERSONAL | 45312 | 16.8% |
| DEBTCONSOLIDATION | 42870 | 15.9% |
| HOMEIMPROVEMENT | 28698 | 10.6% |
Words
| Value | Count | Frequency (%) |
| education | 54918 | |
| medical | 51288 | |
| venture | 46914 | |
| personal | 45312 | |
| debtconsolidation | 42870 | |
| homeimprovement | 28698 |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 374310 | |
| O | 286236 | |
| N | 261582 | |
| I | 220644 | |
| T | 216270 | |
| A | 194388 | 7.2% |
| D | 191946 | 7.1% |
| C | 149076 | 5.5% |
| L | 139470 | 5.2% |
| M | 137382 | 5.1% |
| Other values (7) | 532128 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2703432 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| E | 374310 | |
| O | 286236 | |
| N | 261582 | |
| I | 220644 | |
| T | 216270 | |
| A | 194388 | 7.2% |
| D | 191946 | 7.1% |
| C | 149076 | 5.5% |
| L | 139470 | 5.2% |
| M | 137382 | 5.1% |
| Other values (7) | 532128 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2703432 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| E | 374310 | |
| O | 286236 | |
| N | 261582 | |
| I | 220644 | |
| T | 216270 | |
| A | 194388 | 7.2% |
| D | 191946 | 7.1% |
| C | 149076 | 5.5% |
| L | 139470 | 5.2% |
| M | 137382 | 5.1% |
| Other values (7) | 532128 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2703432 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| E | 374310 | |
| O | 286236 | |
| N | 261582 | |
| I | 220644 | |
| T | 216270 | |
| A | 194388 | 7.2% |
| D | 191946 | 7.1% |
| C | 149076 | 5.5% |
| L | 139470 | 5.2% |
| M | 137382 | 5.1% |
| Other values (7) | 532128 |
interest_rate
Real number (ℝ)
| Distinct | 1302 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.006606 |
| Minimum | 5.42 |
|---|---|
| Maximum | 20 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 5.42 |
|---|---|
| 5-th percentile | 6.17 |
| Q1 | 8.59 |
| median | 11.01 |
| Q3 | 12.99 |
| 95-th percentile | 16 |
| Maximum | 20 |
| Range | 14.58 |
| Interquartile range (IQR) | 4.4 |
Descriptive statistics
| Standard deviation | 2.9787807 |
|---|---|
| Coefficient of variation (CV) | 0.27063572 |
| Kurtosis | -0.4204075 |
| Mean | 11.006606 |
| Median Absolute Deviation (MAD) | 2.13 |
| Skewness | 0.21377813 |
| Sum | 2971783.6 |
| Variance | 8.8731344 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11.01 | 19974 | 7.4% |
| 10.99 | 4824 | 1.8% |
| 7.51 | 4788 | 1.8% |
| 7.49 | 4122 | 1.5% |
| 7.88 | 4038 | 1.5% |
| 5.42 | 3648 | 1.4% |
| 7.9 | 3636 | 1.3% |
| 11.49 | 3084 | 1.1% |
| 9.99 | 2904 | 1.1% |
| 13.49 | 2850 | 1.1% |
| Other values (1292) | 216132 |
| Value | Count | Frequency (%) |
| 5.42 | 3648 | |
| 5.43 | 12 | < 0.1% |
| 5.44 | 12 | < 0.1% |
| 5.46 | 6 | < 0.1% |
| 5.47 | 30 | < 0.1% |
| 5.48 | 24 | < 0.1% |
| 5.49 | 24 | < 0.1% |
| 5.5 | 6 | < 0.1% |
| 5.51 | 18 | < 0.1% |
| 5.52 | 12 | < 0.1% |
| Value | Count | Frequency (%) |
| 20 | 504 | |
| 19.91 | 54 | < 0.1% |
| 19.9 | 6 | < 0.1% |
| 19.82 | 30 | < 0.1% |
| 19.8 | 6 | < 0.1% |
| 19.79 | 24 | < 0.1% |
| 19.74 | 24 | < 0.1% |
| 19.69 | 72 | < 0.1% |
| 19.66 | 18 | < 0.1% |
| 19.62 | 6 | < 0.1% |
percent_income
Real number (ℝ)
High correlation 
| Distinct | 64 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.13972489 |
| Minimum | 0 |
|---|---|
| Maximum | 0.66 |
| Zeros | 162 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.03 |
| Q1 | 0.07 |
| median | 0.12 |
| Q3 | 0.19 |
| 95-th percentile | 0.31 |
| Maximum | 0.66 |
| Range | 0.66 |
| Interquartile range (IQR) | 0.12 |
Descriptive statistics
| Standard deviation | 0.0872115 |
|---|---|
| Coefficient of variation (CV) | 0.62416582 |
| Kurtosis | 1.0822049 |
| Mean | 0.13972489 |
| Median Absolute Deviation (MAD) | 0.05 |
| Skewness | 1.0344834 |
| Sum | 37725.72 |
| Variance | 0.0076058458 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.08 | 15558 | 5.8% |
| 0.1 | 14526 | 5.4% |
| 0.07 | 14490 | 5.4% |
| 0.09 | 13770 | 5.1% |
| 0.06 | 13452 | 5.0% |
| 0.12 | 13296 | 4.9% |
| 0.05 | 13056 | 4.8% |
| 0.11 | 12948 | 4.8% |
| 0.14 | 11760 | 4.4% |
| 0.04 | 11700 | 4.3% |
| Other values (54) | 135444 |
| Value | Count | Frequency (%) |
| 0 | 162 | 0.1% |
| 0.01 | 1890 | 0.7% |
| 0.02 | 5664 | 2.1% |
| 0.03 | 8928 | |
| 0.04 | 11700 | |
| 0.05 | 13056 | |
| 0.06 | 13452 | |
| 0.07 | 14490 | |
| 0.08 | 15558 | |
| 0.09 | 13770 |
| Value | Count | Frequency (%) |
| 0.66 | 6 | < 0.1% |
| 0.63 | 6 | < 0.1% |
| 0.62 | 12 | < 0.1% |
| 0.61 | 12 | < 0.1% |
| 0.59 | 6 | < 0.1% |
| 0.58 | 6 | < 0.1% |
| 0.57 | 6 | < 0.1% |
| 0.56 | 30 | |
| 0.55 | 30 | |
| 0.54 | 48 |
cred_hist_length
Real number (ℝ)
High correlation 
| Distinct | 29 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.8674889 |
| Minimum | 2 |
|---|---|
| Maximum | 30 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 4 |
| Q3 | 8 |
| 95-th percentile | 14 |
| Maximum | 30 |
| Range | 28 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.8796659 |
|---|---|
| Coefficient of variation (CV) | 0.66121402 |
| Kurtosis | 3.7254884 |
| Mean | 5.8674889 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.6316746 |
| Sum | 1584222 |
| Variance | 15.051808 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 51918 | 19.2% |
| 3 | 49872 | 18.5% |
| 2 | 39222 | 14.5% |
| 5 | 18492 | 6.8% |
| 6 | 17796 | 6.6% |
| 7 | 17334 | 6.4% |
| 8 | 16800 | 6.2% |
| 9 | 16110 | 6.0% |
| 10 | 14742 | 5.5% |
| 12 | 4290 | 1.6% |
| Other values (19) | 23424 | 8.7% |
| Value | Count | Frequency (%) |
| 2 | 39222 | |
| 3 | 49872 | |
| 4 | 51918 | |
| 5 | 18492 | 6.8% |
| 6 | 17796 | 6.6% |
| 7 | 17334 | 6.4% |
| 8 | 16800 | 6.2% |
| 9 | 16110 | 6.0% |
| 10 | 14742 | 5.5% |
| 11 | 4272 | 1.6% |
| Value | Count | Frequency (%) |
| 30 | 138 | |
| 29 | 90 | |
| 28 | 174 | |
| 27 | 138 | |
| 26 | 120 | |
| 25 | 138 | |
| 24 | 204 | |
| 23 | 156 | |
| 22 | 192 | |
| 21 | 144 |
credit_score
Real number (ℝ)
| Distinct | 340 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 632.60876 |
| Minimum | 390 |
|---|---|
| Maximum | 850 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.1 MiB |
Quantile statistics
| Minimum | 390 |
|---|---|
| 5-th percentile | 539 |
| Q1 | 601 |
| median | 640 |
| Q3 | 670 |
| 95-th percentile | 703 |
| Maximum | 850 |
| Range | 460 |
| Interquartile range (IQR) | 69 |
Descriptive statistics
| Standard deviation | 50.435398 |
|---|---|
| Coefficient of variation (CV) | 0.079726051 |
| Kurtosis | 0.20289195 |
| Mean | 632.60876 |
| Median Absolute Deviation (MAD) | 33 |
| Skewness | -0.61024388 |
| Sum | 1.7080436 × 10⁸ |
| Variance | 2543.7294 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 658 | 2436 | 0.9% |
| 649 | 2388 | 0.9% |
| 652 | 2376 | 0.9% |
| 663 | 2364 | 0.9% |
| 647 | 2358 | 0.9% |
| 650 | 2346 | 0.9% |
| 654 | 2346 | 0.9% |
| 667 | 2340 | 0.9% |
| 653 | 2340 | 0.9% |
| 656 | 2316 | 0.9% |
| Other values (330) | 246390 |
| Value | Count | Frequency (%) |
| 390 | 6 | < 0.1% |
| 418 | 6 | < 0.1% |
| 419 | 6 | < 0.1% |
| 420 | 6 | < 0.1% |
| 421 | 6 | < 0.1% |
| 430 | 6 | < 0.1% |
| 431 | 12 | |
| 434 | 6 | < 0.1% |
| 435 | 24 | |
| 437 | 12 |
| Value | Count | Frequency (%) |
| 850 | 6 | |
| 807 | 6 | |
| 805 | 6 | |
| 792 | 6 | |
| 789 | 6 | |
| 784 | 12 | |
| 773 | 6 | |
| 772 | 6 | |
| 770 | 6 | |
| 768 | 6 |
previous_defaults
Boolean
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 263.8 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| True | 137148 | 50.8% |
| False | 132852 | 49.2% |
loan_status
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.1 MiB |
| Value | Count |
|---|---|
| 0 | 210000 |
| 1 | 60000 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 210000 | 77.8% |
| 1 | 60000 | 22.2% |
Words
| Value | Count | Frequency (%) |
| 0 | 210000 | |
| 1 | 60000 | 22.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 210000 | |
| 1 | 60000 | 22.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 270000 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 210000 | |
| 1 | 60000 | 22.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 270000 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 210000 | |
| 1 | 60000 | 22.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 270000 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 210000 | |
| 1 | 60000 | 22.2% |
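The class shares for loan_status can be reproduced from the raw counts. A minimal sketch with illustrative variable names, which also derives the ratio between the two classes:

```python
# Counts copied from the loan_status common values table.
counts = {"0": 210_000, "1": 60_000}

total = sum(counts.values())

# Share of each class as a percentage of all observations.
shares = {label: 100 * count / total for label, count in counts.items()}

# How many rows with status 0 there are per row with status 1.
imbalance_ratio = counts["0"] / counts["1"]

for label, share in shares.items():
    print(f"loan_status={label}: {share:.1f}%")
print(f"ratio of 0 to 1: {imbalance_ratio:.1f}")
```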
Interactions
Correlations
| | age | cred_hist_length | credit_score | education | emp_exp | gender | home_ownership | income | interest_rate | loan_amount | loan_id | loan_intent | loan_status | percent_income | person_id | previous_defaults |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| age | 1.000 | 0.821 | 0.160 | 0.061 | 0.888 | 0.026 | 0.019 | 0.143 | 0.013 | 0.064 | 0.081 | 0.033 | 0.017 | -0.056 | 0.081 | 0.032 |
| cred_hist_length | 0.821 | 1.000 | 0.142 | 0.092 | 0.750 | 0.029 | 0.031 | 0.093 | 0.017 | 0.043 | 0.085 | 0.055 | 0.024 | -0.037 | 0.085 | 0.029 |
| credit_score | 0.160 | 0.142 | 1.000 | 0.130 | 0.172 | 0.014 | 0.011 | 0.023 | 0.011 | 0.006 | 0.011 | 0.020 | 0.015 | -0.012 | 0.011 | 0.179 |
| education | 0.061 | 0.092 | 0.130 | 1.000 | 0.066 | 0.003 | 0.011 | 0.010 | 0.013 | 0.013 | 0.027 | 0.016 | 0.005 | 0.011 | 0.027 | 0.041 |
| emp_exp | 0.888 | 0.750 | 0.172 | 0.066 | 1.000 | 0.025 | 0.015 | 0.120 | 0.016 | 0.052 | 0.067 | 0.032 | 0.019 | -0.050 | 0.067 | 0.031 |
| gender | 0.026 | 0.029 | 0.014 | 0.003 | 0.025 | 1.000 | 0.000 | 0.013 | 0.009 | 0.014 | 0.000 | 0.006 | 0.000 | 0.010 | 0.000 | 0.000 |
| home_ownership | 0.019 | 0.031 | 0.011 | 0.011 | 0.015 | 0.000 | 1.000 | 0.013 | 0.085 | 0.091 | 0.041 | 0.083 | 0.258 | 0.092 | 0.041 | 0.140 |
| income | 0.143 | 0.093 | 0.023 | 0.010 | 0.120 | 0.013 | 0.013 | 1.000 | -0.033 | 0.405 | 0.017 | 0.014 | 0.013 | -0.353 | 0.017 | 0.013 |
| interest_rate | 0.013 | 0.017 | 0.011 | 0.013 | 0.016 | 0.009 | 0.085 | -0.033 | 1.000 | 0.105 | 0.003 | 0.021 | 0.363 | 0.124 | 0.003 | 0.198 |
| loan_amount | 0.064 | 0.043 | 0.006 | 0.013 | 0.052 | 0.014 | 0.091 | 0.405 | 0.105 | 1.000 | 0.011 | 0.033 | 0.126 | 0.666 | 0.011 | 0.068 |
| loan_id | 0.081 | 0.085 | 0.011 | 0.027 | 0.067 | 0.000 | 0.041 | 0.017 | 0.003 | 0.011 | 1.000 | 0.021 | 0.062 | -0.001 | 1.000 | 0.025 |
| loan_intent | 0.033 | 0.055 | 0.020 | 0.016 | 0.032 | 0.006 | 0.083 | 0.014 | 0.021 | 0.033 | 0.021 | 1.000 | 0.142 | 0.022 | 0.021 | 0.081 |
| loan_status | 0.017 | 0.024 | 0.015 | 0.005 | 0.019 | 0.000 | 0.258 | 0.013 | 0.363 | 0.126 | 0.062 | 0.142 | 1.000 | 0.415 | 0.062 | 0.543 |
| percent_income | -0.056 | -0.037 | -0.012 | 0.011 | -0.050 | 0.010 | 0.092 | -0.353 | 0.124 | 0.666 | -0.001 | 0.022 | 0.415 | 1.000 | -0.001 | 0.220 |
| person_id | 0.081 | 0.085 | 0.011 | 0.027 | 0.067 | 0.000 | 0.041 | 0.017 | 0.003 | 0.011 | 1.000 | 0.021 | 0.062 | -0.001 | 1.000 | 0.025 |
| previous_defaults | 0.032 | 0.029 | 0.179 | 0.041 | 0.031 | 0.000 | 0.140 | 0.013 | 0.198 | 0.068 | 0.025 | 0.081 | 0.543 | 0.220 | 0.025 | 1.000 |
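The pairs flagged as highly correlated in the report can be recovered from the matrix by filtering on the absolute coefficient. The sketch below uses a hand-copied subset of the upper triangle and a cut-off of 0.5. That cut-off is an assumption, not a value stated in the report, although it reproduces exactly the six flagged pairs.

```python
# Selected off-diagonal coefficients copied from the correlation table
# (upper triangle only; smaller coefficients omitted for brevity).
correlations = {
    ("age", "cred_hist_length"): 0.821,
    ("age", "emp_exp"): 0.888,
    ("cred_hist_length", "emp_exp"): 0.750,
    ("loan_amount", "percent_income"): 0.666,
    ("loan_id", "person_id"): 1.000,
    ("loan_status", "previous_defaults"): 0.543,
    ("loan_status", "percent_income"): 0.415,
    ("income", "loan_amount"): 0.405,
    ("interest_rate", "loan_status"): 0.363,
    ("income", "percent_income"): -0.353,
}

THRESHOLD = 0.5  # assumed cut-off; not stated in the report

# Keep pairs whose absolute coefficient reaches the threshold, strongest first.
high_pairs = sorted(
    (pair for pair, r in correlations.items() if abs(r) >= THRESHOLD),
    key=lambda pair: -abs(correlations[pair]),
)

for a, b in high_pairs:
    print(f"{a} ~ {b}: {correlations[(a, b)]:.3f}")
```

The perfect correlation between loan_id and person_id reflects that both are row counters with identical values, so neither carries predictive information.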
Missing values
Sample
First rows
| | loan_id | person_id | age | gender | education | income | emp_exp | home_ownership | loan_amount | loan_intent | interest_rate | percent_income | cred_hist_length | credit_score | previous_defaults | loan_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 22 | female | Master | 71948.0 | 0 | RENT | 35000.0 | PERSONAL | 16.02 | 0.49 | 3.0 | 561 | No | 1 |
| 1 | 2 | 2 | 21 | female | High School | 12282.0 | 0 | OWN | 1000.0 | EDUCATION | 11.14 | 0.08 | 2.0 | 504 | Yes | 0 |
| 2 | 3 | 3 | 25 | female | High School | 12438.0 | 3 | MORTGAGE | 5500.0 | MEDICAL | 12.87 | 0.44 | 3.0 | 635 | No | 1 |
| 3 | 4 | 4 | 23 | female | Bachelor | 79753.0 | 0 | RENT | 35000.0 | MEDICAL | 15.23 | 0.44 | 2.0 | 675 | No | 1 |
| 4 | 5 | 5 | 24 | male | Master | 66135.0 | 1 | RENT | 35000.0 | MEDICAL | 14.27 | 0.53 | 4.0 | 586 | No | 1 |
| 5 | 6 | 6 | 21 | female | High School | 12951.0 | 0 | OWN | 2500.0 | VENTURE | 7.14 | 0.19 | 2.0 | 532 | No | 1 |
| 6 | 7 | 7 | 26 | female | Bachelor | 93471.0 | 1 | RENT | 35000.0 | EDUCATION | 12.42 | 0.37 | 3.0 | 701 | No | 1 |
| 7 | 8 | 8 | 24 | female | High School | 95550.0 | 5 | RENT | 35000.0 | MEDICAL | 11.11 | 0.37 | 4.0 | 585 | No | 1 |
| 8 | 9 | 9 | 24 | female | Associate | 100684.0 | 3 | RENT | 35000.0 | PERSONAL | 8.90 | 0.35 | 2.0 | 544 | No | 1 |
| 9 | 10 | 10 | 21 | female | High School | 12739.0 | 0 | OWN | 1600.0 | VENTURE | 14.74 | 0.13 | 3.0 | 640 | No | 1 |
Last rows
| | loan_id | person_id | age | gender | education | income | emp_exp | home_ownership | loan_amount | loan_intent | interest_rate | percent_income | cred_hist_length | credit_score | previous_defaults | loan_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 269990 | 269991 | 269991 | 31 | male | Master | 136832.0 | 9 | RENT | 12319.0 | PERSONAL | 16.92 | 0.09 | 7.0 | 722 | No | 1 |
| 269991 | 269992 | 269992 | 24 | male | High School | 37786.0 | 0 | MORTGAGE | 13500.0 | EDUCATION | 13.43 | 0.36 | 4.0 | 612 | No | 1 |
| 269992 | 269993 | 269993 | 23 | female | Bachelor | 40925.0 | 0 | RENT | 9000.0 | PERSONAL | 11.01 | 0.22 | 4.0 | 487 | No | 1 |
| 269993 | 269994 | 269994 | 27 | female | High School | 35512.0 | 4 | RENT | 5000.0 | PERSONAL | 15.83 | 0.14 | 5.0 | 505 | No | 1 |
| 269994 | 269995 | 269995 | 24 | female | Associate | 31924.0 | 2 | RENT | 12229.0 | MEDICAL | 10.70 | 0.38 | 4.0 | 678 | No | 1 |
| 269995 | 269996 | 269996 | 27 | male | Associate | 47971.0 | 6 | RENT | 15000.0 | MEDICAL | 15.66 | 0.31 | 3.0 | 645 | No | 1 |
| 269996 | 269997 | 269997 | 37 | female | Associate | 65800.0 | 17 | RENT | 9000.0 | HOMEIMPROVEMENT | 14.07 | 0.14 | 11.0 | 621 | No | 1 |
| 269997 | 269998 | 269998 | 33 | male | Associate | 56942.0 | 7 | RENT | 2771.0 | DEBTCONSOLIDATION | 10.02 | 0.05 | 10.0 | 668 | No | 1 |
| 269998 | 269999 | 269999 | 29 | male | Bachelor | 33164.0 | 4 | RENT | 12000.0 | EDUCATION | 13.23 | 0.36 | 6.0 | 604 | No | 1 |
| 269999 | 270000 | 270000 | 24 | male | High School | 51609.0 | 1 | RENT | 6665.0 | DEBTCONSOLIDATION | 17.05 | 0.13 | 3.0 | 628 | No | 1 |
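The sample rows suggest why loan_amount and percent_income are flagged as highly correlated: percent_income appears to be loan_amount divided by income, rounded to two decimals. This is an assumption inferred from the sample, not something the report states. A minimal check against the first three sample rows:

```python
# (income, loan_amount, percent_income) copied from the first three sample rows.
rows = [
    (71948.0, 35000.0, 0.49),
    (12282.0, 1000.0, 0.08),
    (12438.0, 5500.0, 0.44),
]

# Assumption: percent_income = loan_amount / income, rounded to two decimals.
ratios = [round(loan / income, 2) for income, loan, _ in rows]

# Compare each derived ratio with the value reported in the sample.
matches = [ratio == reported for ratio, (_, _, reported) in zip(ratios, rows)]

print(ratios)
print(matches)
```

If the relationship holds across the full dataset, percent_income is a derived feature and could be dropped or recomputed rather than treated as independent input.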